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Abstract 

The stochastic-gauge representation is a method of mapping the equation of mo- 
tion for the quantum mechanical density operator onto a set of equivalent stochastic 
differential equations. One of the stochastic variables is termed the "weight" , and 
its magnitude is related to the importance of the stochastic trajectory. We inves- 
tigate the use of Monte Carlo algorithms to improve the sampling of the weighted 
trajectories and thus reduce sampling error in a simulation of quantum dynam- 
ics. The method can be applied to calculations in real time, as well as imaginary 
time for which Monte Carlo algorithms are more-commonly used. The method is 
applicable when the weight is guaranteed to be real, and we demonstrate how to 
ensure this is the case. Examples are given for the anharmonic oscillator, where 
large improvements over stochastic sampling are observed. 

Key words: quantum dynamics, Monte Carlo, Metropolis algorithm, branching 
algorithm, stochastic gauges, Bose Einstein condensation 



1 Introduction 

Complexity is a fundamental problem in theoretical physics [1]. It refers to 
the large size of many physical systems in terms of their microscopic con- 
stituents, and therefore the near impossibility to predict, from first principles, 
the detailed properties of such a system. In quantum physics the problem 
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is manifest as the enormous dimension of the appropriate Hilbcrt space for 
most reahstic physical systems. In particular, the calculation of exact quantum 
dynamics is notoriously difficult, because the dimension of the Hilbert space 
scales exponentially with the number of modes. This enormous Hilbert space 
prevents complete representation of the evolving many-body quantum state 
as a numerical state-vector for large many-body systems. Successful quantum 
simulation methods can thus only hope to sample the quantum evolution, to 
some finite precision, by use of stochastic methods. 

Such quantum Monte Carlo (QMC) techniques have a long history in first- 
principles, microscopic calculations of thermal equilibrium and ground states 
in quantum systems [2,3]. Certain classes of QMC methods, such as diffusion 
or Green's function approaches [4] (projector methods) are restricted to cal- 
culation of ground state properties. Methods based on path integrals [5] are 
a simulation through imaginary time, and can calculate correlations at any 
temperature. However, when it comes to real-time calculations (i.e. dynam- 
ics), path integral QMC methods become difficult because of sign or phase 
problems [6,7]. 

An alternative approach is provided by phase-space representations [8,9,10], 
which can be used to map quantum dynamics to a set of equivalent stochas- 
tic differential equations. The number of phase-phase equations scale polyno- 
mially with the number of modes, allowing computationally tractable simu- 
lations. Phase-space methods have proved useful in the past for simulating 
quantum dynamics, particularly in the field of quantum optics [11,12,13] A 
natural extension of these techniques is to the field of degenerate quantum 
gases, where the interacting particles are atoms or molecules rather than pho- 
tons [14,15]. Recently it has been discovered that Fermi gases can be treated 
with related techniques [16,17]. 

The mapping of a quantum problem to phase-space equations is far from 
unique. This nonuniqueness can be exploited to tailor the form the stochastic 
equation without affecting the physical, ensemble result. The different choices 
correspond to different "stochastic gauges". In this paper, we use this free- 
dom to generate stochastic equations with real weights, which we then sample 
with Monte Carlo techniques. The real weights avoid the sign or phase prob- 
lem encountered in other QMC approaches to quantum dynamics. Since the 
stochastic gauge method is a relatively new technique, we choose to focus here 
on an especially simple case with known exact solutions, in order to clarify 
the problems and advantages of this real weight approach. 

We emphasise, however, that even the simple case of the quantum anharmonic 
oscillator that we treat here has highly nontrivial behaviour when treated as 
a stochastic problem. It is also relevant to current experiments on the dy- 
namics of ultra-cold atoms trapped in optical lattices. These are described 
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rather accurately by the so-called Bose-Hubbard model, which reduces to the 
type of single-mode theory treated here in the Mott-insulator limit in which 
inter-well tunneling is suppressed. We simulate the decay of coherence due 
to phase-diffusion, a physical effect that has already been experimentally ob- 
served in recent BEC experiments [18]. The present paper focuses on this rel- 
atively straightforward and exactly soluble case, in order to demonstrate the 
important principles behind first-principles quantum dynamical simulation in 
real time. Due to the linear scaling of these methods with increasing numbers 
of modes, we expect that the same basic ideas will apply to multi-mode or 
multi-well situations where there are no known exact solutions in general. 

This paper is organised as follows. In Sec. 2 we review the stochastic-gauge 
representation [19] and motivate the use of Monte Carlo techniques for sam- 
pling the weighted stochastic trajectories. In Sec. 3 we introduce a simple 
model for interacting quantum dynamics, and describe the design of a gauge 
that results in real weights. 

In Sec. 4 we review the Metropolis-Hastings algorithm [20,21,22] and describe 
its application to stochastic-gauge simulations, demonstrating the improve- 
ments over the usual stochastic sampling for the same type of gauge. 

In Sec. 5 we describe an alternative scheme, based on stochastic sampling, but 
with a branching algorithm for efficient handling of the weights. 

Finally, in Sec. 6 we discuss the advantages and disadvantages of each Monte 
Carlo method in the context of future applications. 



2 The Stochastic-Gauge Representation 

The stochastic-gauge (or gauge-P) representation is a generalisation of the 
positive-P i+P) representation [9,10], where the density operator is expanded 
as a positive distribution over an off-diagonal, over-complete basis set of coher- 
ent states. The essential difference between the +P and the stochastic-gauge 
representation is that the later is defined over a quantum phase space with 
an additional dimension termed the weight Q, such that the total phase space 
vector is a = {cx.,f3,Q) of complex dimension 2M + 1. Throughout this pa- 
per the +P variables (a, /3) are referred to as mode variables. For a complete 
description of the method we refer the reader to Ref . [19] , however we briefly 
summarise the main features of the method below. 

The procedure for calculating quantum dynamics using the stochastic-gauge 
representation is similar to that using the -\-P representation and involves 
deriving, via a Fokker-Planck equation, a set of stochastic differential equa- 
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tions equivalent to the original quantum master equation. The most general 
quantum-dynamical evolution may be written as a master equation of the 
form 

were L is a Liouville superoperator (e.g. L[p] = —i[H, p] for unitary evolution). 
To calculate the quantum dynamics using the stochastic-gauge representation 
we expand the density operator in an over-complete basis set as 

p = y d^^+^aG{d, a*)A{d) , (2) 

where 

A(a) = n\\cx){l3*\\exp[-a- (3], 
is the kernel, G{d, a*) is the (non-unique) gauge distribution function, and 



> = E^I^), (3) 



is a Bargmann coherent state [10]. 

By use of operator identities for creation and annihilation operators acting 
on the kernel and subsequent integration by parts (with the assumption that 
boundary terms vanish), it is possible to show that any master equation involv- 
ing only two-body terms is equivalent to a gauge distribution function evolving 
according to a positive-definite Fokker-Planck equation. The dynamical mo- 
ments may therefore be obtained by evolving an equivalent set of stochastic 
differential equations (SDEs) and taking stochastic averages of an appropriate 
product of stochastic variables [23] . 

At this stage it is possible to add arbitrary terms to the Fokker-Planck differ- 
ential operator that give zero when acting on the kernel. These terms cannot 
affect the quantum averages, but add stochastic gauges — arbitrary functions 
on phase space — to the drift part of the corresponding SDEs [19]. The central 
result is easy to state. For an M mode quantum system with +P Ito equations 
of the form 

(j = + BWf, (4) 
where A^~^^ and B*-"*"^ are the positive-P drift vector and diffusion matrix, 
respectively, then the stochastic-gauge equations for the system are 

<j = lW + BW(e-^) , (5) 

n^nfC (6) 

Here g = g{d) is a vector of stochastic gauges and ^ is a vector of gaussian 
noises where 

mMt')) = ks{t - 1') . (7) 
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The stochastic gauges g can be used to modify the deterministic evolution of 
the stochastic trajectories and are therefore called drift gauges. 



Also possible are diffusion gauges which arise from the non-unique factorisa- 
tion of drift matrix D appearing in the Fokker-Planck equation into a noise 
matrix B for the SDEs, where 



Diffusion gauges can be used, for example, to "squeeze" the noise between 
stochastic variables to improve sampling [24]. More general types of diffusion 
gauge are possible in the full stochastic-gauge formalism [19,25]. 

For a single mode, quantum-dynamical averages of normally ordered products 
of creation and annihilation operators are calculated as stochastic averages in 
the following manner 



In principle any gauge that does not introduce boundary terms on partial 
integration will reproduce the exact quantum averages in the limit that an 
infinite number of trajectories are simulated, and so all gauges represent the 
same physics. However, in practice we may only simulate a finite number of 
trajectories due to limited computing resources, and so we would like to choose 
a gauge that gives rise to the most compact phase space distribution G{d,d*,t) 
possible. A narrower distribution means that fewer stochastic trajectories need 
to be sampled to obtain quantum averages with a given accuracy. The situation 
is similar to classical or quantum electrodynamics, where a judicious choice of 
gauge simplifies the solution of certain problems. 

2. 1 Monte Carlo techniques for the weight variable 

The motivation for this work is that in the stochastic gauge representation, 
stochastic trajectories are often generated with weights that vary over many 
orders of magnitude as illustrated in Fig. 1. From Eq. (9) we can see that 
the trajectories with relatively high weight contribute more to the overall 
stochastic averages than those with relatively low weight. 

The situation is analogous to path-integral calculations of quantum averages 

in imaginary time (thermal quantum averages), where quantum Monte Carlo 
techniques have long been used to sample the weight parameter more effi- 
ciently [26,4]. In the standard stochastic gauge prescription the stochastic tra- 
jectories are generated randomly. If the weight parameter can be interpreted 



D = BB^ . 



(8) 




(0/3"^a" + (QrQ;^)*),toch 

(Q -|- fi*)stoch 



(9) 
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as a probability, then this suggests that more sophisticated Monte Carlo tech- 
niques can be used to efficiently sample high- weight trajectories, thus improv- 
ing the sampling of physical averages. In this paper we focus on real-time 
dynamics, although the same techniques apply to imaginary-time stochastic 
gauge calculations [17]. Previously Monte Carlo techniques have had hmited 
success with real-time quantum dynamics, where path-integral approaches are 
plagued by sign problems due to the rapidly-oscillating phase [3,27]. 

However, there is a complication — in the stochastic-gauge representation the 
weight is complex in general. In principle it is possible to apply Monte Carlo 
techniques to problems with a complex weight — by treating the modulus 
of the weight as the 'importance'. In practice, however, sign problems are en- 
countered when the phase of the weight eventually becomes evenly distributed 
around the unit circle in the complex plane. In order to strictly interpret the 
weight as a probability distribution, we must ensure that our choice of gauge 
leads to a weight parameter that is real. This means that the gauge functions 
we introduce must also be real. 



3 Example: The Kerr Anharmonic Oscillator 



In this section we introduce a single-mode boson model for quantum nonlinear 
dynamics — the Kerr anharmonic oscillator. The Hamiltonian is 

11 11 
H — fiuJo(a^a + -) + -na^^a^ — huJo{h + -) + -nn{h — 1), (10) 

where n = a^a is the number operator. Traditionally this Hamiltonian has been 
used to describe the Kerr effect in nonlinear optics. It has received renewed 
interest recently, as it is the restriction to a single site of the Bose-Hubbard 
model. This has been shown to describe ultra-cold bosonic atoms in an optical 
lattice [28], a topic of recent theoretical and experimental interest [29,30]. 

Throughout this paper we work in the interaction picture with 

/^int = ^^a+'a". (11) 

A number state is an eigenstate of this Hamiltonian. Introducing the dimen- 
sionless time t = nt/h, a. number state evolves according to 

\n{T)) = exp(-m(n - 1)t/2) |n(0)) (12) 

Hence, any initial state can be decomposed into number states and an exact 
solution found. In particular, for an initial coherent state |'V'(0)) = \a) with a 
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Fig. 1. Illustration of the spreading of weights in a stochastic gauge simulation 
for the anharmonic oscillator. These three trajectories of O were generated using 
randomly chosen noises for a real-gauge simulation of the anharmonic oscillator 
(A = 2, X = 1/2), as described in Sec. 3.1, with mean atom number n = 100. 
The total simulation time was rtot = l/\/^ = 0.1 and the time step used was 
At = Ttot/lO^ = 10~^. Note that the relative weights of the trajectories vary in 
time so the Metropolis algorithm can only be targeted to sample the distribution at 
a single chosen time. The branching algorithm clones and kills trajectories continu- 
ously in time in proportion to their weight so as to obtain continuous-time samples 
of the distribution. 



real, the solution is 




As this model both has an exact sohition and inchides the nonlinearity that 
is a feature of models such as the Bose-Hubbard Hamiltonian, it forms an 
excellent testing ground for quantum simulation methods [31]. 

An important quantum feature of this Hamiltonian is that given an initial 
coherent state p(0) = |a)(Q;| with mean boson number n = the dynamics 
display a series of collapses and revivals. Defining the quadrature variables 

X^{a + d^)/2, f = (a-a^)/2i, (14) 

we find there are three characteristic timescales for anharmonic oscillator 
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dynamics. The quadratures initially undergo oscillations with period Tosc ~ 
0(l/n), which are damped to zero over a time of Tcou ~ 0{l/\/n). However, 
the oscillations revive at time Trev ^ Oil), which for a large mean boson 
number can be many times the collapse time. 

Because the period of oscillation of the quadratures can be very short for large 
mean atom number, and because we are most interested in the envelope of 
the collapse, we choose to perform the calculations in a rotating frame. The 
angular frequency of the rotation is equal to the mean atom number, and the 
X-quadrature, whose exact solution is 

(X(t)) = \/rfe^("°^(")-^) cos(n(sin(T) - r)) , (15) 

collapses dynamically for large mean atom number according to: 

{X{t)) ~ \/rie-^^' . (16) 

For the anharmonic oscillator Hamiltonian the stochastic-gauge Stratonovich 
SDEs are 



a — —ia'^l3 + ia/2 + \/^acosh.{A){^a — Qa) (17) 
P^i(3''a + + Vi(3sinh{A){ip - gp) (18) 

(l^^Y.9j^3- (19) 

3=a,l3 

Here A is a diffusion gauge that we choose to be constant in this paper, Qj, 
j = a, (3 are drift gauges to be chosen subsequently, and are dimensionless 
Gaussian noise terms chosen so that: 

mrMr')) = 6,,5{t - t') . (20) 

For better numerical performance we choose to work with log variables: 

e = (1/2) log(a/?), = (l/2i) log(a//?), uj = log(n). (21) 
These obey the following Stratonovich SDEs: 



e^\e-\ii-gi-i{i2-g2)), (22) 

</•= -e'' + \- \e\i, -g, + ^(6 - 52)), (23) 
u^S^+Y.93ij, (24) 
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where we have defined the hnearly transformed noises = (^^ + Cg)/V^ 
and ^2 = i^a ~ ^13) /V^, which obey the same statistics, and similarly gi = 
{da + g/3) I a/2 and (72 = [ga — dp)/ a/2- The term S^j is a Stratonovich correction 
factor which depends on the gauge choice, and will be calculated for each 
specific case. We note here that, unhke in the corresponding classical oscillator 
equations, both 9 — 9x -\- idy and (f) — (px + i<f>Y are intrinsically complex. 



3. 1 Choice of Gauge 



In this section we discuss possible choices of diffusion gauge for the anharmonic 
oscillator with a view to using Monte Carlo techniques to sample the weight. 

In Ref. [32] Drummond and Deuar investigated the following drift-gauge choice 
for the anharmonic oscillator 

= i^2 = ie^+2e^sin(2^y), (25) 

where the X and Y superscripts refer to the real and imaginary parts of 
the phase-space variable respectively (this notation is used throughout this 
paper). They found that this gauge extended simulation times by many orders 
of magnitude for the same number of stochastic trajectories. In particular they 
could simulate well past the collapse time for an initial coherent state. 

As noted in Sec. 2.1, for the Monte Carlo methods to be successful it is highly 
desirable to have a real weight. Unfortunately the above choice of gauge is 
complex and thus leads to complex weights. We have trialled the use of Monte 
Carlo techniques based on using the modulus of the weight, but found these 
to be unsuccessful due to the phase problem described in the introduction. 

In order to have real weights, we would like to design a real gauge with similar 
properties to the above complex gauge to give a similar extension of simula- 
tion time. The reason that the complex gauge Eq. (25) improves simulation 
times is that it removes a driving term from the imaginary part of the +P (j) 
equation. The presence of this term forces trajectories to diverge to infinity in 
a non-classical direction in phase-space (^y — > ±00), thus resulting in large 
sampling errors. At the same time the equations for 6 and the real part of 
6 are unaffected due to a cancellation between gi and g2. We could remove 
the offending term from the (py equation with a single real gauge g2, however 
this would necessarily appear in the 9y equation leading to an instability that 
causes poor sampling. 

Thus it seems that a drawback to choosing real gauges is that there is sub- 
stantially less control over the stochastic trajectories. This could have been 
anticipated because, in general, with a complex gauge we have as many gauge 
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degrees of freedom as we do real phase space dimensions, 4M (excluding the 
weight dimensions). However with real gauges we only have half as many gauge 
degrees of freedom, 2M, meaning that we cannot independently control the 
drift in the real and imaginary components of each of the mode variables. 

An additional source of gauge freedom is obtained from choosing a non-square 
noise matrix in deriving the SDEs. This freedom was pointed out in [19] but 
not explored. Specifically, it is possible to take the noise matrix to be of the 
general form 

B = [Bo,Q], (26) 

where Bq is a square (2M x 2M) noise matrix such that BqBq = D — QQ^, 
and Q is a 2M x W matrix whose entries arc arbitrary complex functions. 
This choice reproduces the correct moments in the limit of a large number 
of stochastic trajectories, but introduces more than the minimum number of 
noise terms into the stochastic equations. Naively this could be expected to 
lead to worse sampling errors. However, this additional gauge freedom allows 
us to overcome the restrictios of the standard real gauges and improve the 
sampling overall. 

For the anharmonic oscillator we choose 

(27) 

in the {6, 0) variables, where A could be an arbitrary complex function on 
phase space, although we choose it to be constant in our example. Here Q 
satisfies QQ^ = so the other noise terms are unaffected. In doing so we have 
added a term of the form X{^3 + i^4) to the equation of motion for 0, Eq. (23). 
An equivalent way of understanding this additional noise is that due to the 
analytic nature of the stochastic gauge kernel we have 

Thus we are free to add X^d"^ / d4>\ + d"^ / dcpy) to the Fokker-Planck differential 
operator without affecting the physical moments. 

We are now able to introduce additional gauges to the equation in the 
manner described in [19] so that the extra term in the equation becomes 

xm-gz)+i{i,-g,)). (29) 

and the gauges and g^ enter the weight equation in the same way as the 
other drift gauges. The extra noise allows us to control the (py divergence using 
only a real gauge, without affecting the 6y equation. Specifically we choose 



Q 




A iX 
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91^92^ 93^0, (30) 
^4 = ^sin(2^y). (31) 



The final SDEs that we focus on samphng for the rest of this paper are sum- 
marised as 



^=^e-^(ei-^6), (32) 
^^-e^^^ cos(2^y) + ^ 

^(e^(6 + i6)-2A(e3 + ie4)), (33) 



uj = S^-^sm{2eY)U, (34) 

where S^j — —e^^^ sin^(2^y)/2A^ is the Stratonovich correction in the weight 
equation. Note that although we choose = it is still necessary to include 
the noise ^3 for the mapping to be exact. 

Of course there are other possible choices of real gauges, but we have found 
this combination to be well-suited to illustrating the improvements possible 
with Monte Carlo sampling. 

In Fig. 2 we illustrate stochastic samphng of the above SDEs. We begin with 
an initial coherent state with mean atom number n — 100, and simulate to a 
final time of r = 1/ \/fi = 0.1, which is of the order of the collapse time. Clearly 
the sampling of the solution is only accurate for short times. In particular, the 
sampled (f2)stoch decays towards zero when it should remain at one for all 
times in a simulation of unitary dynamics. The error in sampling the mean 
atom number, which should remain at 100, is due almostly entirely to poor 
sampling of the weights, as can be seen by the almost identical decay. Even 
worse, the estimated error in the means is small despite the fact that they are 
clearly far from the analytic result. This poor sampling occurs more quickly 
on the scale of the collapse for larger mean atom number. 

This behaviour can be understood by considering the nature of the distribution 
of weights. We find that the weight parameter evolves with time towards 
zero for most trajectories. However, we know that the mean weight must 
be one, and so there must exist a small number of trajectories with large 
weights. Thus the distribution of weights seems to be skewed towards zero 
with a long tail extending to large weights. Such a distribution is difficult to 
sample, so it is not surprising that we underestimate the errors in the means 
as these assume Gaussian statistics. Fortunately, Monte Carlo techniques are 
particularly successful in sampling skewed distributions. Below we describe the 
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Fig. 2. Real-gauge simulation of anharmonic oscillator dynamics for an initial 
coherent state with mean atom number n = 100 {A = 2, A = 1/2). (a) Mean 
weight, (b) mean atom number (corresponding stochastic variable n = a(3) and (c) 
mean X— quadrature (corresponding stochastic variable X = {a + (3)/2). Averages 
were carried out over 10^ stochastic trajectories. The total simulation time was 
'Ttot = = 0.1 and the time step used was Ar = rtot/10^ = 10~^. The shaded 

region indicates the estimated error in the simulation and the stochastic means are 
in the centre of the region. The analytic results arc shown as dashed lines. The 
sampling is poor except for short times ll2ie to the skewed weight distribution. 



application of the Metropolis-Hastings algorithm and a Monte Carlo branching 
algorithm to our SDEs for the anharmonic oscillator. 



4 The Metropolis-Hcistings Algorithm 

The Metropolis-Hastings algorithm is a well-known technique for generating 
samples of a mult i- dimensional probability distribution. Here we briefly de- 
scribe the algorithm and discuss how to estimate errors in quantities derived 
from such samples. In particular this may be useful for readers from the ultra- 
cold matter community who may be unfamiliar with these methods. 

4.1 Algorithm 

The Metropolis algorithm was first described in 1953 [20] and was used to 
sample a thermal Boltzmann distribution, which can be viewed as a probabil- 
ity distribution over the space of all thermally accessible states of a system. 
Subsequently the Metropolis algorithm has been generalised to many types of 
probability distributions/densities over both discrete and continuous spaces, 
and there is a vast amount of mathematical literature on Markov Chain Monte 
Carlo (MCMC) techniques (see e.g. [6,7].) 

In this section we summarise the Metropolis algorithm in a more general form 
due to Hastings [21]. A well- written introduction to the Metropohs-Hastings 
algorithm may be found in [22], and here we follow their notation. Formally, 
the Metropolis algorithm is a MCMC technique for efficiently sampling a prob- 
ability distribution, 7r(s), where s E S represents the state of some system, 
and S is the domain of the distribution known as the state space. To be a true 
probability distribution 7r(-) must be normalised / dsTr{s) — 1, however one 
of the virtues of the algorithm is that the functional form of the distribution 
need only be known up to a constant factor in order to apply the algorithm ^ . 

A Markov chain can be thought of as a random walk through state space 
where the probability of making a particular step depends only on the current 
location. It is a sequence of points, Si, S2, ■ ■ ■ Sn e S, where the Sj are random 
variables such that 

S2 . . . Si-i) = p{si\si-i). (35) 

^ e.g. the partition function Z = ^^^5 g--B(s)/fcsT^ where E{s) is the energy of the 
system in state s, need not be know in order to sample the Boltzmann distribution, 
where the probability that the system is in state s is given by P{s) = e~^^^^/^'^'^ / Z 
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The conditional density p{si\si-i) is known as the transition density of the 
Markov chain. If there exists an invariant density 




(36) 



that satisfies the condition of detailed balance 



p{s'\s)n*{s)^p{s\s')7r*{s'), 



(37) 



then it may be shown that the Markov chain converges to the distribution 7r(-). 
More precisely, the elements of the chain {si\i — b + 1, . . . , n} are unbiased 

samples from the distribution 7r*(-) to within some given precision, where b > 
is known as the burn-in and represents the number of steps required for the 
chain to converge to within that precision. 

The Metropolis-Hastings algorithm solves the problem of determining what 
transition density to use so as to generate a given invariant distribution. It 
is concerned with generating samples from some target density 7r(-), known 
apart from a constant factor, and does so by determining a suitable transition 
density for a Markov chain to converge to this distribution. To do this we 
require a candidate generating density, q{s,s'), where / q{s,s')ds' = 1, which 
selects the next point in the Markov chain. The acceptance probability of this 
step is 



The Metropohs-Hastings algorithm can be summarised as follows 

(1) repeat for i = 1,2, . . . ,n 

(2) generate s' from q{si, •) and a uniform random variable u between and 
1. 

(3) ii u < a{si, s') then set Sj+i = s'; 

(4) else set Si+i = Si, 

(5) return {si\i — 1,2, . . . ,n} 

The Metropohs-Hastings algorithm generates a Markov chain in the state 
space with a transition density Pmhis'\s) — q{s, s')a{s, s'). When a proposed 
move is rejected the chain remains where it is. This choice of transition den- 
sity ensures that the condition of detailed balance is satisfied and hence the 
Markov chain converges to the required target density. The algorithm gener- 
ates a Markov chain that finds regions in state space where the probability 
distribution is peaked and samples these region with the correct frequency. 

The efficiency of the algorithm is dependent on the choice of candidate gen- 
erating function. Indeed the optimal choice of q{s, s') for a particular target 
density remains an active area of research today. An important special case 




(38) 
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occurs if the candidate generating distribution is symmetric, q{s, s') — q{s', s), 
and so 



This was Metropohs's original formulation which was generahsed to the case 
of an asymmetric candidate generating function in 1970 by Hastings [21]. 

4-2 Estimating sampling error 

In general we wish to determine the weighted average over all possible states 
of some observable of the system 



Because the Metropohs algorithm produces a set samples of the distribution 
7r(-), {si\i — b + 1 . . .b + n}, such averages can be estimated as 



The statistical uncertainty in such a quantity can be difficult to estimate 
because the samples are generally correlated with one another, as the samples 
produced by the Metropolis algorithm are not independent. One way to obtain 
less correlated samples is to only take every g*^ point in the chain after the 
burn-in for the purpose of calculating averages, where g is known is the gap. 
However, this does not usually produce more accurate estimates of averages 
than could have been obtained by simply taking every sample after the burn- 
in. 

Alternatively, one can accept the fact that the samples are correlated, but 
attempt to account for the correlations in some quantitative way. A scheme 
for estimating errors in means calculated from correlated Monte Carlo data 
may be found in [33]. 

For large dimensional state spaces one may still not be satisfied with estimat- 
ing errors from a single Markov chain, even if the correlations are accounted 
for. An example is the sampling of probability density with multiple peaks, 
which may not individually give correct averages for state-space dependent 
quantities. In this case, the simplest procedure is to run multiple Markov 
chains of the same type but with different starting points, and treat the av- 
erages obtained from each chain as samples of the mean of the quantity of 
interest. The overall mean is calculated by 




(39) 





n 



(40) 



N 



(41) 
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where (0)j, i = 1 . . . N are the means from the independent Markov chains as 
in Eq. (40), and the subscript vr has been omitted for clarity. By the central 
limit theorem one would expect samples obtained in this way to approach a 
Gaussian distribution about the true mean. Therefore an estimate of the error 
in the mean, for N independent Markov chains is 



\ 



Ef=i [{Oh - (Q)J /N 

N-1 



(42) 



where A[-] denotes the error in the mean of the quantity of interest. This 
notation for error in the mean is used throughout this paper when referring 
to means obtained from Metropolis data. 



4-3 Metropolis-Hastings Algorithm for the Stochastic Gauge Formalism 



The appearance of the weight Q as a multiplicative factor in the stochastic av- 
erages in Eq. (9) suggests an interpretation as a probability distribution. In this 
section we show that it is possible to interpret the weight as a probability dis- 
tribution over the space of all noise, and hence apply the Metropolis-Hastings 
algorithm. 

For the purposes of computer simulation, time is necessarily discretised. To 
simulate the time evolution of a system for a period T we divide the time 
domain into N + 1 points so that each step forward in time is of length 
At = T/N. For an M mode system with the standard set of drift gauges 
(i.e. without the extra gauges and noises discussed in Sec. 3.1) we require 2M 
Gaussian distributed random numbers for each step forward in time, and thus 
2MN random numbers to evolve an entire trajectory. Hence our fundamental 
object is a 2MN- component vector of Gaussian distributed random numbers 
w e R^-^-^, called the noise vector. The different reahsations of w give rise 
to different stochastic trajectories. The values of all stochastic phase space 
variables after N time steps are thus functions of w 

a = a[w], = Q[w]. 

The stochastic average of some some quantity, (O)stoch is 

(0).toch = / d'^'^wPiwMw) = hm 
where the Wi are drawn from a multi-dimensional Gaussian normal distribu- 
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tion' 



and O = 0[w] is some quantity depending on the noise. 

Hence the stochastic samphng of moments in stochastic gauge simulations, 
Eq. (9), is the samphng of a multi-dimensional integral 

(fi(^)0„„(^)),toch. = / d'''''wP{w)Q{w)0^n{w), (44) 

where 0,rin{w) = P[w]'"a[w]" + {P[w]''a[w]'^)* . 

The Metropolis-Hastings algorithm can be applied by identifying the state 
space S as R^-^-^ and the state s as the vector of noises w. The probability 
distribution we wish to sample is 

= , (45) 

where Af = J d'^'^'^wP{w)Q[w] is a normalisation constant. The Metropolis 
algorithm can be used to generate a set of samples of 7r(w), {wi\i = 1 . . . n} 
and estimate quantum averages as 

mn 

{[a^) a")QM = — ^ — = — ^ — c^— ^ . (46) 

None of these steps require explicit knowledge of the normalisation constant 
J\f, although in real-time unitary calculations we know analytically that it 
should always be one. Typically we run multiple Markov chains as described 
in Sec. 4 to obtain means and standard-deviation error estimates, {Omn)n 

For comparison, ordinary stochastic sampling uses a Gaussian normal random 
number generator to generate samples, {wi\i — 1 . . -N}, of P{w) and calcu- 
lates quantum averages as 

{[a'j a )qm ^ ■ 



Finally we note that the Metropohs-Hastings algorithm as outhned above only 
optimises the stochastic sampling of moments at the final target time. At ear- 
lier times the probability distribution 7i{w) (where w is the noise vector w 
truncated at the point corresponding to the earlier time) is different in that 

^ Other noise distributions are possible as the central limit theorem ensures Gaus- 
sian statistics in the limit of infinitesimal step size — however we use Gaussian 
statistics here. 
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t:{w')/t:{w) ^ 'k{w')/'k{w)^ and so we should not expect the Metropohs algo- 
rithm targeted to the later time to correctly sample the moments at earlier 
times. This is also illustrated in Fig. 1 where the relative weights vary consid- 
erably as time progresses. 



4.4 Designing a Candidate Generating Function 



The choice of candidate generating function q{w, w') is a subtle problem and 
perhaps the most crucial element of applying the Metropolis-Hastings algo- 
rithm to the stochastic gauge formalism. The Metropolis algorithm generates 
a Markov chain in noise space, where at each step the candidate generating 
function proposes a new noise w' for the stochastic trajectory which is then 
accepted or rejected based on the acceptance probability defined by Eq. (38) 

-/X . A Ti{w')q{w' ,w)\ . ( P{w')Vt{w')q{w\w)\ 
a\w, w ) — mm 1, — ^ = mm 1, ' 



TT{w)q{w,w') J \ ' P{^^)^l{w)q{w,w') J 

The evaluation of the weight at the target time for the proposed noise Q{w') is 
a non-local procedure; even if only a single noise is altered in the time domain 
the entire trajectory from that time on has to be evolved until the target time 
in order to evaluate the new weight. 

To separate the issue of sampling high weight trajectories from the Gaussian 
nature of the noise it is advisable to choose a candidate generating function 
such that 

q{w,w') ^ P{w) 

q{w,w') P{w')' ^ ' 

For this class of generating functions, the probability of a move being accepted 
is 

^ ' ' 7r{w)q{w,w') fl{w)P{w)q{w,w') fl{w)' ^ ^ 
which is only dependent on the weight rather than the Gaussian distribution. 
Pi-)- 



4-5 Time and frequency domain noise functions 



A simple generating function of this type is selecting a number of entries in 
the current noise vector and generating new noises for these. This procedure 

can be carried out in either the time or the frequency domain of the noise. If 
the noises arc altered in the time domain the stochastic trajectory is clearly 
unaltered up until the point of the first change. If noises are altered in the 
frequency domain then every noise in the time domain is affected to some 
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extent. We consider altering noises in the frequency domain to be a more 
natural "small-step" for the Metropolis algorithm as all time-domain noises 
are changed by a small amount on average. All results presented in this paper 
use a generating function operating in the frequency domain. 

In more detail, consider the R independent noise vectors of length A^, Wj G 
M^,j = 1 . . . R, separately, instead of a single noise vector of length RN. 
Insight into strategies for altering noises may be gained by taking the discrete 
Fourier transform of each noise vector 

Kj{n)^j:e'^^'^-/^w,{k). (49) 

it=0 

Here a subscript is used to denote each of the R noise vectors, and should not 
be confused with the notation Wj used previously to denote the sequence of 
noise vectors in a Markov chain. Brackets (•) denote the components of the 
noise/spectrum vectors. The n = component of K is the mean of the noise, 
and the fact that the noise is real places constraints on the components. Let 
J\f{fi, a) denote a Gaussian distribution of mean /i and standard deviation a. 
The Fourier transform of a vector whose components are normally distributed 
real variables {w{n) distributed as A/'(0, 1)) is complex. For even the real and 
imaginary parts of each component in the range 2 < n < A" — 1 are distributed 
as A^(0, and subject to the constraint K{N — n) — K{n)*, while the 

n—1 and n — N components are real and distributed as ^/'(O, VN). For odd 
A^, the real and imaginary parts of each component in the range 2 <n < N are 
distributed as A/'(0, and subject to the constraint K{N — n) = K{n)*, 

while only the n—1 component is real and distributed as A/'(0, V^)- These 
constraints preserve the total number of independent components A". 

It is informative to study an individual trajectory and alter noise elements 
individually in frequency space to gauge the effect on the final weight. One 
might expect that the low frequency noise should be more important than 
the high frequency noise, as high frequency noise should "average out" over a 
shorter time scale and thus not affect the dynamics as much. We find that this 
is true and Fig. 3 quantifies the potential of frequency components of the noise 
to affect the final value of the weight. To obtain this graph four random noise 
vectors were generated and the corresponding stochastic trajectory evolved 
using real gauge discussed in Sec. 3.1 with mean atom number, n = 100, 
and diffusion gauge A = 2. Using this initial trajectory as a starting point, 
a particular component of the noise spectrum was examined for its effect 
on the final weight by choosing 10^ random values for that component and 
evolving the corresponding stochastic trajectory, with all other components 
of the noise spectrum the same. In Fig. 3 the standard deviation of the final 
weights obtained in this manner is plotted on a log scale against the spectrum 
component number that was altered. It is clear from this figure that the low 
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frequency noises have greater potential to affect the final weight. This trend 
is independent of the initial trajectory, however we note the actual values of 
the standard deviations obtained vary considerably depending on the initial 
trajectory. 

Another question to consider is how many sites in the frequency domain to 
alter in proposing a new noise vector, and how to select these. As the low- 
frequency noise seems to affect the final value of the weight more than the 
high-frequency noise, one might be tempted to consider strategies where low- 
frequency noise is altered more often so as to more effectively explore the noise 
space. In practice we found that such strategies were no more effective than 
selecting sites randomly. Typically we chose to alter of the order 1 — 10% of 
the total components of the noise. A good guiding principle is that Metropolis 
sampling is thought to be most efficient when approximately 50% of proposed 
moves are accepted during sampling [6,7]. We therefore experimented with 
different percentages with this principle in mind. 



4-6 Results 

We now present the results of a Metropolis sampling of the real-time dynamics 
of the anharmonic oscillator using the real gauge discussed in Sec. 3.1 and 
the candidate generating functions discussed in Sec. 4.4. To allow for the 
distribution 7r(w) = P{w)Q{w)/J\f to be multiply-peaked we use the more 
robust technique of estimating means and errors using multiple Markov chains 
as discussed in Sec. 4. 

Figure 4 shows the Metropolis sampling of (n) = {a^d) and {X) = {{d + d^)/2) 
for a real-gauge simulation of the anharmonic oscillator with ri = 100. We tar- 
geted the Metropohs sampling to 20 time points in intervals of Ar = 0.005 
from to the final time of r = T = = 0.1. The average at each 

time point is completely independent of the other time points. Altogether 
n = 10® samples were used for the average at each point so statistically the 
samphng is comparable to the stochastic sampling in Fig. 2. However because 
the Metropolis sampling has to be run independently for each time point and 
each Markov chain has to burn in before sampling begins, there is clearly a 
much greater computational effort required to obtain the Metropolis results. 
Nevertheless the Metropolis results are far more reliable than the stochastic 
results — at most time points the average {X) is correct to within the esti- 
mated error. In contrast, the stochastic sampling exhibited systematic errors 
that were not accounted for by appropriately-large error bars. 

We note that even with the Metroplis algorithm the sampled mean atom num- 
ber appears to decay slightly at longer times. Although the results are still ac- 
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Fig. 3. Illustration of the effect of varying frequency components of the noise vectors 
on the final value of the weight. An initial stochastic trajectory from a real-gauge 
simulation of the anharmonic oscillator with mean atom number n = 100 was 
taken as a starting point. The value of the weight at the end of the trajectory, 
T = 0.5/-v/H = 0.05, was Q = 0.9936. The time step was Ar = lO^'^ so each noise 
vector contained 500 components. In frequency space each noise vector contains 250 
complex components. A new random noise was chosen for a particular component of 
the noise vector, the entire trajectory re-evolved and the new final weight recorded. 
This procedure was repeated 10^ times for each frequency component and the vari- 
ance in the final weight, cr{0,), calculated. This variance is plotted versus frequency 
component to give a measure of the potential of each frequency component to affect 
the final weight. Clearly low-frequency noise has more effect than high-frequency 
noise. 



curate to within 1 % the estimated error bars do not account for the difference 
from the analytic result. This behaviour, which is seen for the branching algo- 
rithm is well, is due to the inherent difficulty in sampling the skewed weight 
distribution even with Monte Carlo algorithms. The results are nonetheless 
vastly improved compared to stochastic sampling 

We experimented with different distributions of the 10^ allowed samples be- 
tween different numbers of Markov chains. There is a trade-off between number 

of Markov chains and the length of the chain — if longer chains are run (e.g. 
10^ chains each with 10"^ samples) then the means obtained from each chain 
are more accurate, and closer to Gaussian about the true mean. However, 
there are less chains to average over compared with a run of a larger number 
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Fig. 4. Metropolis sampling of real-gauge simulation of anharmonic oscillator dy- 
namics for an initial coherent state with mean atom number n = 100 {A = 2, 
A = 1/2). (a) Mean atom number, (b) mean X— quadrature. Metropolis aver- 
ages and error estimates (crosses with error bars) were calculated from the means 
of 10^ Markov chains, each taking 10^ samples after a burnin of 10^ proposals. 
The total simulation time was T = Xjyfft = 0.1 and the time step used was 
At = r/10^ = 10"*^. Proposals were generated by uniformly selecting 10% of 
the components of the noise vectors in the frequency domain and generating new 
random noise for those sites. The analytic results are shown as dotted green lines. 



of shorter chains (e.g. 10^ chains each with 10^ samples). It was a matter of 
empirical observation to determine the best balance. 

We also experimented with the fraction of noises to be altered in the frequency 
domain when proposing a new noise vector. As a rough guide we aimed to have 
50% of proposed moves accepted during sampling. However at short times we 
found that even if a very large fraction of the noises were altered (e.g. 50% or 
more) a large fraction of proposed changes were accepted (~ 90%). This large 
acceptance rate is due to the distribution of weights being relatively narrow at 
short times. At longer times the distribution has spread out sufficiently such 
that lower acceptance rates are possible. 

Clearly there are many parameters to be optimised in Metropolis sampling 
of stochastic gauge equations and we have only scratched the surface. We 
have aimed to present conceptually-simple approaches to illustrate the princi- 
ple rather than exhaustively optimise all parameters. The fact that enormous 
improvements were seen over stochastic sampling — even with our very sim- 
ple Metropolis schemes — gives us confidence that further improvements are 
possible. 
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5 Branching Algorithm 



The second Monte Carlo technique that we investigate for real-time stochastic- 
gauge simulations is a branching algorithm similar to that used in Green's 
function Monte Carlo, see e.g. [34]. Corney and Drummond [17] have previ- 
ously used this algorithm for stochastic simulations in imaginary time using a 
Gaussian basis (a generalisation of the stochastic-gauge basis). 

The branching algorithm is simpler to describe and more straightforward to 
apply to stochastic gauge simulations. Another advantage is that there are 
fewer free parameters than the Metropolis algorithm. The branching algorithm 
works by concurrently evolving a "population" of stochastic trajectories in 
time, and periodically cloning those that acquire a large weight and killing 
those that acquire a small weight. 



5.1 Algorithm 



We define T to be the total simulation time, Arb be the time interval between 
branching events and At the fundamental time-step for integrating the SDEs. 
In practice it is desirable for the number of branching events B — T / At^^ 
and the number of time steps in a branching period Arb/ At to be integers. 
Formally the branching algorithm can be stated as 

(1) Begin with an initial population of A^pop stochastic trajectories. 

(2) Evolve all stochastic trajectories forward in time for a period ATb 

(3) fori = l, 2, ...,7Vpop 

• Generate a uniform random variable u between and 1 

• Make rrii = int[fii/Q -|- u] clones of trajectory i. 

• Set Qi = 1 

(4) Set Arp„p = ^5>-m.. 

Here we set Q = {fl) to ensure that the number of trajectories in the popu- 
lation A^pop remains constant on average. Because does not couple into the 
SDEs for the mode variables, further evolution is not affected by resetting of 
the weights at each branching time. The statistical equivalence between the 
weight and the multiplicity of paths means that the physical moments are 
unchanged on average by the branching procedure. To see this note that the 
average of the mean of an observable O after the branching event is 
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which is identical to the mean before the branching event. 



5.2 Results 



In this section we present the results of a branching-algorithm sampling of real- 
time dynamics of the anharmonic oscillator using the real gauge discussed in 
Sec. 3.1. Similarly to the Metropolis sampling, we calculate averages and error 
estimates from multiple independent populations. 

Fig. 5 shows a branching-algorithm sampling of (n) = (d^a) and (X) — 
((a + a^)/2) for a real-gauge simulation of the anharmonic oscillator with 
n = 100. Again we see an enormous improvement in the sampling compared 
with stochastic sampling, Fig. 2. A clear advantage of the branching algorithm 
over Metropolis is that it generates physical moments at every time step as 
opposed to being targeted to a single time. 

At each time there are, on average, 10^ stochastic trajectories contributing 
to the stochastic averages. As with the Metropolis, it was again a matter 
of experimenting with different ratios of the number of trajectories to the 
number of populations to determine the best balance. We present results for 
the same division of trajectories amongst populations (10^ populations of 10^ 
trajectories) as samples amongst Markov chains for the Metropolis algorithm 
so that the results are as comparable as possible. 

The only free parameter in the algorithm itself is the time between branching 
events. Art. There is a trade-off between making this interval large enough 
that the weights spread out sufficiently for the branching to be meaningful and 
small enough to improve the sampling continuously in time. The branching 
interval used for this simulation. Art = 10^^, is small on the scale of the 
dynamics of the system but large enough to allow the weights to diverge 
significantly between branching events. 
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Fig. 5. Branching- algorithm sampling of real-gauge simulation of anharmonic oscil- 
lator dynamics for an initial coherent state with mean atom number n = 100 {A = 2, 
A = 1/2). (a) Mean atom number, (b) mean X— quadrature. Branching averages 
and error estimates, indicated by the shaded region, were calculated from the means 
of 10^ populations, each containing 10^ trajectories on average. The total simulation 
time was r = = 0.1 and the time step used was At = r/10^ = 10~^. The 

branching time was Arb = 10""^. The analytic results are shown as dotted lines. 



6 Conclusions and Outlook 



In this paper we have demonstrated the use of two Monte Carlo techniques, 
the Metropolis algorithm and a branching algorithm, for real-time calculations 
of quantum dynamics with the stochastic-gauge method. This work should 
be considered a proof of principle rather than a fully optimised 'recipe'. It 
is part of a larger program of optimising bases, gauges and algorithms for 
stochastic simulations of quantum dynamics in real and imaginary time. A 
timely application of these methods is in theoretical calculations for ultra-cold 
atomic gases [35,17]. QMC methods have been used to calculate some static 
properties of ultra-cold gases, e.g. see [36,37]. These systems are quantum- 
many body by nature and hence few exact theoretical results exist. They are 
an ideal testing ground for theory due to their purity and well-understood 
controllable interactions. 

In this work wc have restricted ourselves to real gauges so that the weights 
remain real and can be interpreted as probabilities, and have considered the 
single-mode anharmonic oscillator as an example of our methods. In order 
to control the divergence of stochastic trajectories using real gauges we have 
explored a previously untested gauge freedom resulting from the choice of a 
non-square noise matrix. The resulting distribution of weights becomes highly 
skewed on a time scale proportional to the inverse of the mean atom number. 
The weight parameter for most stochastic trajectories tends towards zero. 
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whereas very few tend towards a large weight. Such distributions are hkely 
to be ubiquitous for unitary real-time stochastic-gauge simulations with real 
gauges, as the weight distribution necessary broadens with time while the 
mean weight must remain unity. 

While such skewed distributions are difficult to sample with the usual stochas- 
tic methods, they are ideally suited to Monte Carlo importance sampling tech- 
niques that preferentially sample high-weight trajectories. Indeed we found 
enormous improvements over stochastic sampling using both the Metropolis 
and branching algorithms. The branching algorithm is the more straightfor- 
ward to apply because it has only one free parameter (the branching interval) 
and produces results at every time step. We suggest it as the best starting point 
for future Monte Carlo simulations. By contrast the Metropolis algorithm has 
to be targeted to a particular time and so seems less useful. However, there 
is a lot more freedom in the Metropolis algorithm and a vast literature exists 
on optimising sampling for particular distribution. Thus it seems quite pos- 
sible that it will be better suited to some problems, especially when further 
improvements over the branching algorithm are desirable. 

Traditionally Monte Carlo techniques have been highly successful in imaginary- 
time calculations for thermal equilibrium. This paper has extended the use of 
these techniques to real-time quantum-dynamical calculations and thus opens 
a new domain of application for these algorithms. In future, these techniques 
need to be extended to many-mode, many-particle problems where exact so- 
lutions are not known. 
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